8 research outputs found

    Unsupervised and knowledge-poor approaches to sentiment analysis

    Sentiment analysis focuses on the automatic classification of a document's sentiment (and, more generally, the extraction of opinion from text). Ways of expressing sentiment have been shown to depend on what a document is about (domain dependency). This complicates supervised methods for sentiment analysis, which rely on extensive use of training data or linguistic resources that are usually either domain-specific or generic. Both kinds of resources prevent classifiers from performing well across a range of domains, as this requires appropriate in-domain (domain-specific) data. This thesis presents a novel unsupervised, knowledge-poor approach to sentiment analysis aimed at creating a domain-independent and multilingual sentiment analysis system. The approach extracts domain-specific resources from the documents that are to be processed and uses them for sentiment analysis. It does not require any training corpora, large sets of rules or generic sentiment lexicons, which makes it domain- and language-independent but at the same time able to utilise domain- and language-specific information. The thesis describes and tests the approach, which is applied to different data, including customer reviews of various types of products, reviews of films and books, and news items; and to four languages: Chinese, English, Russian and Japanese. The approach is applied not only to binary sentiment classification, but also to three-way sentiment classification (positive, negative and neutral), subjectivity classification of documents and sentences, and the extraction of opinion holders and opinion targets. Experimental results suggest that the approach is often a viable alternative to supervised systems, especially when applied to large document collections.

    Basic Units for Chinese Opinionated Information Retrieval

    This paper presents the results of experiments in which the authors tested different types of features for retrieval of Chinese opinionated texts. We assume that the task of retrieval of opinionated texts (OIR) can be regarded as a subtask of general IR, but with some distinct features. The experiments showed that the best results were obtained by combining character-based processing, dictionary lookup (maximum matching) and a negation check.
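
    As an illustration of two of the ingredients named above, the following is a minimal sketch of forward maximum matching against a dictionary, followed by a simple negation check. It is not the authors' implementation: the function names, the toy lexicon, the `max_len` window and the rule that a negation flips only the next polarity word are all assumptions made for this example.

```python
def max_match(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest
    substring found in the lexicon, falling back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def opinion_score(text, polarity, negations, max_len=4):
    """Sum polarity values of matched words; a negation flips the sign
    of the next polarity word (an assumed, deliberately simple rule)."""
    lexicon = set(polarity) | set(negations)
    score = 0
    negate = False
    for tok in max_match(text, lexicon, max_len):
        if tok in negations:
            negate = True
        elif tok in polarity:
            score += -polarity[tok] if negate else polarity[tok]
            negate = False
    return score


# Hypothetical toy resources, for illustration only.
POLARITY = {"好": 1, "喜欢": 1, "差": -1}
NEGATIONS = {"不"}

if __name__ == "__main__":
    print(max_match("我不喜欢", set(POLARITY) | NEGATIONS))
    print(opinion_score("我不喜欢", POLARITY, NEGATIONS))
```

    Characters that match no dictionary entry are emitted one at a time, which is how this sketch falls back to character-based processing.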

    Automatic seed word selection for unsupervised sentiment classification of Chinese text

    We describe and evaluate a new method of automatic seed word selection for unsupervised sentiment classification of product reviews in Chinese. The whole method is unsupervised and does not require any annotated training data; it only requires information about commonly occurring negations and adverbials. Unsupervised techniques are promising for this task since they avoid the problems of domain dependency typically associated with supervised methods. The results obtained are close to those of supervised classifiers and sometimes better, up to an F1 of 92%.

    Multilingual opinion holder and target extraction using knowledge-poor techniques

    We describe an approach to multilingual sentiment analysis, in particular opinion holder and opinion target extraction, which requires no annotated data and minimal language-specific input. The approach is based on unsupervised, knowledge-poor techniques which facilitate adaptation to new languages and domains. The system's results are comparable to those of supervised, language-specific systems previously applied to the NTCIR-7 MOAT evaluation data.

    Comparable Domain Dependency in Sentiment Analysis

    Sentiment analysis (or opinion mining) is concerned not with the topic of a document, or its factual content, but rather with the opinion expressed in a document. In this paper we present a number of experiments in word-based sentiment analysis on two corpora representing two related domains: film reviews and book reviews. We find that even close domains are very difficult to process without utilising in-domain data. We also indicate certain characteristics of features that affect the cross-domain performance of sentiment classifiers.
